Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change OpLogFilter::agent_id to be optional #826

Merged
merged 1 commit into from
Dec 2, 2024

Conversation

henry0715-dev
Copy link
Contributor

@henry0715-dev henry0715-dev commented Sep 11, 2024

Close #724

The RocksDB keys in the oplog are currently in the form agent_id-0-timestamp. This change modifies the keys to the form timestamp-sequence.

Copy link

codecov bot commented Sep 11, 2024

Codecov Report

Attention: Patch coverage is 89.36937% with 59 lines in your changes missing coverage. Please review.

Project coverage is 77.55%. Comparing base (2d2c6b8) to head (19d4409).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/ingest.rs 15.15% 28 Missing ⚠️
src/graphql.rs 92.41% 11 Missing ⚠️
src/storage/migration.rs 92.70% 7 Missing ⚠️
src/ingest/generation.rs 77.77% 6 Missing ⚠️
src/storage.rs 89.36% 5 Missing ⚠️
src/graphql/log.rs 91.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #826      +/-   ##
==========================================
+ Coverage   77.01%   77.55%   +0.53%     
==========================================
  Files          31       32       +1     
  Lines       25569    26061     +492     
==========================================
+ Hits        19692    20211     +519     
+ Misses       5877     5850      -27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@henry0715-dev henry0715-dev changed the title [Draft] Temp commit for OpLogFilter development direction Change OpLogFilter::agent_id to be optional Nov 4, 2024
@henry0715-dev henry0715-dev marked this pull request as ready for review November 4, 2024 08:55
@henry0715-dev henry0715-dev force-pushed the henry/op-log-filter branch 2 times, most recently from 6b211c0 to be77ab4 Compare November 5, 2024 03:53
@sophie-cluml sophie-cluml self-requested a review November 7, 2024 06:39
src/storage/migration.rs Outdated Show resolved Hide resolved
src/ingest/generation.rs Outdated Show resolved Hide resolved
let mut last_reset_day = self
.last_reset_date
.lock()
.expect("last_reset_date should be exist.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions regarding this line:

  • Could you run grammar check for the expect message please?
  • Could you help me understand the choice of expect here please ? To me it seems like we need to use something like panic! according to our guideline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change the source code from Mutex to RwLock as below.
Therefore, there seems to be no need to discuss expect and panic!.

    pub struct SequenceGenerator {
        counter: AtomicUsize,
        last_reset_date: RwLock<u32>,
    }

    pub(crate) async fn generate_sequence_number(&self) -> usize {
        let current_date_time = SequenceGenerator::get_current_date_time();
        {
            let last_reset_day = self.last_reset_date.read().await;
            if *last_reset_day == current_date_time {
                return self.counter.fetch_add(1, Ordering::Relaxed);
            }
        }
        {
            let mut last_reset_day = self.last_reset_date.write().await;
            self.counter.store(1, Ordering::Release);
            *last_reset_day = current_date_time;
        }
        self.counter.fetch_add(1, Ordering::Relaxed)
    }

src/storage/migration/migration_structures.rs Outdated Show resolved Hide resolved
src/ingest.rs Outdated Show resolved Hide resolved
@henry0715-dev henry0715-dev force-pushed the henry/op-log-filter branch 2 times, most recently from e67329d to 97ce434 Compare November 14, 2024 05:34
@sehkone sehkone self-requested a review November 15, 2024 05:13
@henry0715-dev henry0715-dev force-pushed the henry/op-log-filter branch 2 times, most recently from ff0dcae to bf40057 Compare November 15, 2024 09:03
CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@sophie-cluml sophie-cluml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the version in Cargo.toml file should be updated to have suffix '-alpha.2'.

@henry0715-dev
Copy link
Contributor Author

I think the version in Cargo.toml file should be updated to have suffix '-alpha.2'.

It's done.
Thanks you.

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@henry0715-dev
Copy link
Contributor Author

@sophie-cluml
I would like to ask if this PR is included in release version 0.23.0.

@sophie-cluml
Copy link
Contributor

@henry0715-dev, I believe there are still a few areas that need to be addressed in the current PR, and with the demand for the upcoming release, I'm afraid it may not be feasible to include this PR in the 0.23.0 release. I apologize for any inconvenience, and I appreciate your understanding.

@henry0715-dev
Copy link
Contributor Author

@sophie-cluml
You want to work by modifying the version of that PR to 0.24.0-alpha.1.
If you need to set up another version, please let me know.

@sophie-cluml
Copy link
Contributor

sophie-cluml commented Nov 22, 2024

As a reply to #826 (comment), I would appreciate if you could update the version to 0.24.0-alpha.1 for this PR and make consecutive changes.

@henry0715-dev
Copy link
Contributor Author

As a reply to #826 (comment), I would appreciate if you could update the version to 0.24.0-alpha.1 for this PR and make consecutive changes.

It's done.
Thanks you.

src/storage.rs Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@sophie-cluml
Copy link
Contributor

I would like to merge this PR after @henry0715-dev resolves a conflict in the CHANGELOG. As far as I know, David and Kone are interested in this PR, so I'd like to ask @kimhanbeom and @sehkone if there are any remaining concerns, so that I need to put hold on merging this PR.

@msk
Copy link
Contributor

msk commented Nov 27, 2024

I noticed the b'0' separator is still used in the new key format. While it was necessary in the current format to resolve sorting issues, it seems redundant in the new format, where the fixed-size timestamp ensures proper lexicographical order.

To illustrate, in the current format:

  • Two agent IDs, 0x77 and 0x7766, with timestamps starting at 0x99, would produce keys:
    • 0x770099...
    • 0x77660099...
      Without the b'0', they become 0x7799... and 0x776699..., causing incorrect sorting (0x776699 comes before 0x7799).

In the new format, this problem doesn’t arise because the fixed-size timestamp ensures keys like:

  • 0x990077...
  • 0x99007766...
    which sort correctly even without b'0'.

Removing b'0' can simplify the format and reduce overhead.

@henry0715-dev
Copy link
Contributor Author

Removing b'0' can simplify the format and reduce overhead.

I applied the content to the source code.
Thank you.

src/storage.rs Outdated

pub fn mid_key(mut self, key: usize) -> Self {
let mid_key = key.to_be_bytes();
self.pre_key.reserve(mid_key.len() + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need + 1 here, because we are not padding the key with b'0' anymore, and the same for the other relevant places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally
I modified the following in the lower_closed_bound_start_key, upper_open_bound_start_key, and upper_closed_bound_start_key functions.

as-is
self.pre_key.reserve(TIMESTAMP_SIZE+1);
to-be
self.pre_key.reserve(TIMESTAMP_SIZE);

Close #724

The RocksDB keys in the `oplog` are currently in the form agent_id-0-timestamp.
This change modifies the keys to the form timestamp-sequence.
@sophie-cluml sophie-cluml merged commit 07b5a3c into main Dec 2, 2024
10 checks passed
@sophie-cluml sophie-cluml deleted the henry/op-log-filter branch December 2, 2024 09:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change OpLogFilter::agent_id to be optional
4 participants